A small example to compare NNC with optimal control on a toy dynamical system.
# To edit source files and automatically refresh code
%load_ext autoreload
%autoreload 2
# Load custom modules path
import sys
sys.path.append('../../')
# Custom modules path
import nnc.controllers.baselines.ct_lti.dynamics as dynamics
import nnc.controllers.baselines.ct_lti.optimal_controllers as oc
from nnc.controllers.neural_network.nnc_controllers import\
NeuralNetworkController, NNCDynamics
# Computation and plot helpers for this example
from small_example_helpers import EluTimeControl, evaluate_trajectory, todf
from small_example_helpers import compare_trajectories, grand_animation
# progress bar
from tqdm.notebook import tqdm
# Other libraries for computing
import torch
import numpy as np
import pandas as pd
# ODE solvers with gradient flow
from torchdiffeq import odeint
# plots
import plotly
plotly.offline.init_notebook_mode()
First we define our dynamics as:
\begin{equation}\label{eq:ld}
\dfrac{dx}{dt} = \langle A, x^{\intercal} \rangle + \langle B, u^{\intercal}\rangle
\end{equation}
where:
\begin{align}
x &: \text{a state vector over $N$ nodes.}\\
A &: \text{an $N\times N$ interaction matrix, indicating nodal interactions.}\\
u &: \text{a control signal vector of $M\leq N$ independent control signals.}\\
B &: \text{an $N\times M$ driver matrix, indicating control signal effects on nodes.}
\end{align}
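Concretely, for column vectors this is the standard linear time-invariant form $\dot{x} = Ax + Bu$. A minimal NumPy sketch of the derivative computation, independent of the library implementation used below:

```python
import numpy as np

def lti_derivative(A, B, x, u):
    """Right-hand side of the CT-LTI dynamics: dx/dt = A x + B u."""
    return A @ x + B @ u

# the 2-node toy system used throughout this notebook
A = np.array([[0., 1.],
              [1., 0.]])
B = np.array([[1.],
              [0.]])
x = np.array([0.5, 0.5])
u = np.array([2.0])  # an arbitrary control value for illustration

dxdt = lti_derivative(A, B, x, u)  # → [2.5, 0.5]
```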
In this example we parametrize the system with the matrices defined in the next cell. Essentially, we evaluate control over a 2-node undirected graph, with the control signal applied to one node (a single driver node).
# Basic setup for calculations, graph, number of nodes, etc.
dtype = torch.float32
device = 'cpu'
training_session = True
# interaction matrix
A = torch.tensor([
[0., 1.],
[1., 0.]
])
# driver matrix
B = torch.tensor([
[1.],
[0.]
])
# interaction matrix dimensions denote how many nodes we have in the network
n_nodes = A.shape[-1]
# column dimension implies the number of driver nodes.
n_drivers = B.shape[-1]
# implementing the dynamics
linear_dynamics = dynamics.ContinuousTimeInvariantDynamics(A, B, dtype, device)
The exact implementation of the dynamics:
dynamics.ContinuousTimeInvariantDynamics??
We aim to control the system from initial state $x(0)=(0.5, 0.5)$ to $x^* = (1,-1)$ within time $T=1$.
# total time for control
T = 1
# we evaluate two points in time, first point is matched to initial
# state and second one is matched to the terminal state
t = torch.linspace(0, T, 2)
# initial state is set as follows, but can be chosen arbitrarily:
x0 = torch.tensor([[
0.5, 0.5
]])
# same applies for target state:
x_target = torch.tensor([[
1, -1
]])
Here we use minimum-energy optimal control as a baseline, based on the following work:
For CT-LTI systems that satisfy the controllability condition, this baseline achieves the maximum theoretical performance: it reaches the target state exactly while spending the minimum possible control energy, so no other controller can surpass it.
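To make the construction concrete, here is a self-contained NumPy sketch of minimum-energy control for this 2-node system. It is not the library implementation (which the `oc` controller used below provides): the controllability Gramian $W(T)=\int_0^T e^{As}BB^{\intercal}e^{A^{\intercal}s}\,ds$ is approximated with Simpson's rule, and the control is $u(t)=B^{\intercal}e^{A^{\intercal}(T-t)}W(T)^{-1}\left(x^* - e^{AT}x_0\right)$:

```python
import numpy as np

def expm_sym(A, t):
    """Matrix exponential e^{A t} for a symmetric A via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.exp(w * t)) @ V.T

A = np.array([[0., 1.], [1., 0.]])
B = np.array([[1.], [0.]])
x0 = np.array([0.5, 0.5])
x_target = np.array([1., -1.])
T = 1.0

# controllability Gramian W(T) via Simpson's rule
n = 200  # even number of subintervals
s = np.linspace(0.0, T, n + 1)
h = T / n
vals = np.stack([expm_sym(A, si) @ B @ B.T @ expm_sym(A, si).T for si in s])
weights = np.ones(n + 1)
weights[1:-1:2] = 4
weights[2:-1:2] = 2
W = (h / 3) * np.tensordot(weights, vals, axes=1)

# minimum-energy control signal
v = np.linalg.solve(W, x_target - expm_sym(A, T) @ x0)
def u(t):
    return B.T @ expm_sym(A, T - t) @ v

# simulate dx/dt = A x + B u(t) with classic RK4
def f(t, x):
    return A @ x + B @ u(t)

x = x0.copy()
steps = 200
dt = T / steps
for k in range(steps):
    t = k * dt
    k1 = f(t, x)
    k2 = f(t + dt / 2, x + dt / 2 * k1)
    k3 = f(t + dt / 2, x + dt / 2 * k2)
    k4 = f(t + dt, x + dt * k3)
    x = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

print(x)  # close to x_target = [1, -1]
```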
# baseline definition
oc_baseline = oc.ControllabiltyGrammianController(
alpha = A,
beta = B,
t_infs = T,
x0s = x0, # a batch with one sample
x_infs = x_target, # a batch with one sample
simpson_evals = 100,
progress_bar=None,
use_inverse=False,
)
Below, on line 2 we create a lambda function to make the dynamics compatible with the required method signature, and then on line 3 we evolve the system with the odeint method from the torchdiffeq package.
The result of odeint is a tensor of shape [timesteps, state_size].
The reached state corresponds to the last index of the result, [-1].
As we print on line 4, optimal control reaches the target state after control, $x(T)\approx x^*$.
timesteps = torch.linspace(0, T, 2)
oc_dynamics = lambda t,x: linear_dynamics(t, x, oc_baseline(t, x))
xT_oc = odeint(oc_dynamics, x0.unsqueeze(0), t=timesteps)[-1]
print(str(xT_oc.flatten().detach().cpu().numpy()))
We define a custom neural network for learning control. In this example we pick a fully connected architecture with 2 layers of $n+4$ neurons and exponential linear unit (ELU) activations. The content of the class can be found below:
EluTimeControl??
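For orientation, a hypothetical minimal stand-in for such a time-dependent control network is sketched below. It is illustrative only, not the actual `EluTimeControl` implementation (inspect that with the `??` cell above): two fully connected layers of `n_nodes + 4` hidden units with ELU activations, emitting one value per driver node.

```python
import torch

class TimeControlSketch(torch.nn.Module):
    """Hypothetical stand-in for EluTimeControl: maps time t to M control signals."""
    def __init__(self, n_nodes, n_drivers):
        super().__init__()
        hidden = n_nodes + 4
        self.net = torch.nn.Sequential(
            torch.nn.Linear(1, hidden),
            torch.nn.ELU(),
            torch.nn.Linear(hidden, hidden),
            torch.nn.ELU(),
            torch.nn.Linear(hidden, n_drivers),
        )

    def forward(self, t, x):
        # the state x is accepted to match the controller call signature,
        # but this sketch conditions the control on time only
        t_in = t.reshape(1, 1).to(x.dtype)
        return self.net(t_in)

net = TimeControlSketch(n_nodes=2, n_drivers=1)
u = net(torch.tensor(0.5), torch.zeros(1, 2))
print(u.shape)  # one control signal for the single driver node
```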
We define a training routine for the neural networks.
We use x0.detach() and t.detach() on lines 7 and 8 respectively to avoid unexpected gradient flows. We train with the adaptive-step solver dopri5 on line 8, but we evaluate with a constant step size on line 16, to limit performance advantages due to step-size selection.
def train(nnc_dyn, epochs, lr, t, n_timesteps=40): # simple training
    optimizer = torch.optim.Adam(nnc_dyn.parameters(), lr=lr)
    trajectories = [] # keep trajectories for plots
    for i in tqdm(range(epochs)):
        optimizer.zero_grad() # do not accumulate gradients over epochs
        x = x0.detach()
        x_nnc = odeint(nnc_dyn, x, t.detach(), method='dopri5')
        x_T = x_nnc[-1, :] # reached state at T
        loss = ((x_target - x_T)**2).sum() # no energy regularization!
        loss.backward()
        optimizer.step()
        trajectory = evaluate_trajectory(linear_dynamics,
                                         nnc_dyn.nnc,
                                         x0,
                                         T,
                                         n_timesteps,
                                         method='rk4',
                                         options=dict(step_size=T / n_timesteps)
                                         )
        trajectories.append(trajectory)
    return torch.stack(trajectories).squeeze(-2)
We can now apply training, collect trajectories, and save the model. Notice that we decrease the learning rate on lines 11 and 14: NNC is very sensitive to the learning rate and often requires several lr-adaptations to converge to a desirable performance. Such adaptations can be automated, as discussed in the paper.
n_timesteps = 40 # relevant for plotting; the ODE solver is variable step
linear_dynamics = dynamics.ContinuousTimeInvariantDynamics(A, B, dtype, device)
if training_session:
    torch.manual_seed(0)
    neural_net = EluTimeControl(n_nodes, n_drivers)
    nnc_model = NeuralNetworkController(neural_net)
    nnc_dyn = NNCDynamics(linear_dynamics, nnc_model)
    # time to train now:
    t = torch.linspace(0, T, n_timesteps)
    t1 = train(nnc_dyn, 200, 0.1, t) # 200 epochs, learning rate 0.1
    df1 = todf(t1, lr=0.1)
    t2 = train(nnc_dyn, 300, 0.01, t) # 300 epochs, learning rate 0.01
    df2 = todf(t2, lr=0.01)
    torch.save(neural_net, 'trained_elu_net.torch')
    alldf = pd.concat([df1, df2], ignore_index=True)
    alldf.to_csv('all_trajectories.csv')
else:
    neural_net = torch.load('trained_elu_net.torch')
    alldf = pd.read_csv('all_trajectories.csv', index_col=0)
    nnc_model = NeuralNetworkController(neural_net)
    nnc_dyn = NNCDynamics(linear_dynamics, nnc_model)
Now we can check the reached state for NNC; as we observe, it is close to the target. More epochs at lower learning rates will further improve this result.
t = torch.linspace(0, T, 2)
x = x0.detach()
ld_controlled_lambda = lambda t, x_in: linear_dynamics(t, u=neural_net(t, x_in), x=x_in)
x_all_nn = odeint(ld_controlled_lambda, x0, t, method='rk4',
options = dict(step_size=T/n_timesteps))
x_T = x_all_nn[-1, :]
print(str(x_T.flatten().detach().cpu().numpy()))
Now we compare NNC to optimal control (OC) in terms of: (i) controlled trajectories (left) and (ii) total energy (right). As we see, both methods are extremely close.
fig_comparison, _, _ = compare_trajectories(linear_dynamics,
oc_baseline,
nnc_model,
x0,
x_target,
T,
x1_min=-3,
x1_max=3,
x2_min=-1.5,
x2_max=1.5,
n_points=200,
)
fig_comparison
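The total control energy plotted in the right panel is $E(T)=\int_0^T \lVert u(t)\rVert^2 \, dt$. Given a control signal sampled on a time grid, it can be approximated with simple quadrature; a minimal sketch follows (the helper functions above compute this internally, possibly differently):

```python
import numpy as np

def control_energy(u_samples, t_grid):
    """Approximate E(T) = integral of ||u(t)||^2 dt with the trapezoidal rule."""
    squared_norms = np.sum(np.asarray(u_samples)**2, axis=-1)
    return np.sum(0.5 * (squared_norms[1:] + squared_norms[:-1]) * np.diff(t_grid))

# illustration: u(t) = t on [0, 1] has energy ∫ t^2 dt = 1/3
t_grid = np.linspace(0.0, 1.0, 1001)
u_samples = t_grid[:, None]  # one control signal per time point
print(control_energy(u_samples, t_grid))  # ≈ 0.3333
```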
Finally we animate the learning effort of NNC through the training process. We observe how it slowly converges to the OC trajectory, both in terms of state values and energy.
animation = grand_animation(alldf,
linear_dynamics,
oc_baseline,
nnc_model,
x0,
x_target,
T,
frame_duration=50) # 50-100 should be ok.
animation
This notebook gave us a quick yet insightful demonstration of NNC and its performance compared to optimal control. The results indicate that NNC is a capable controller and that it is equipped with implicit energy regularization. Still, larger-scale demonstrations on complex dynamics will follow to showcase more realistic scenarios, as already done in the paper.